Data-Model Relationship in Text-Independent Speaker Recognition
Abstract
Text-independent speaker recognition systems, such as those based on Gaussian mixture models (GMMs), do not include time sequence information (TSI) within the model itself. How important TSI is to speaker recognition is an open question, and it is the one addressed in this paper. Recent work has shown that higher-level information such as idiolect, pronunciation, and prosodics can help reduce speaker recognition error rates. In line with these developments, this paper aims to show that, as more data becomes available, the basic GMM can be enhanced by utilising TSI, even in a text-independent mode. It presents experimental work incorporating TSI into the conventional GMM. The resulting system, known as the segmental mixture model (SMM), embeds dynamic time warping (DTW) into a GMM framework. Results on the 2000-speaker SpeechDat Welsh database show improved speaker recognition performance with the SMM.
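The claim that a GMM carries no time sequence information can be illustrated with a small sketch. This is not the paper's SMM, whose details are not given in the abstract. It is a toy comparison under assumed values: the two-component diagonal-covariance GMM, the synthetic two-dimensional frames, and the helper names `gmm_log_likelihood` and `dtw_distance` are all hypothetical. It shows that a GMM score does not change when the frames are reordered, whereas a textbook DTW cost does.

```python
import math

def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        peak = max(comp_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp_logs))
    return total / len(frames)

def dtw_distance(a, b):
    """Textbook DTW cost between two frame sequences (Euclidean local cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Assumed toy data: 20 two-dimensional "feature frames" with a clear time trend.
frames = [(t / 10.0, math.sin(t / 3.0)) for t in range(20)]
reversed_frames = frames[::-1]

# Assumed toy model: two equally weighted components with unit variances.
weights = [0.5, 0.5]
means = [(0.0, 0.0), (1.5, 0.5)]
variances = [(1.0, 1.0), (1.0, 1.0)]

score_forward = gmm_log_likelihood(frames, weights, means, variances)
score_reversed = gmm_log_likelihood(reversed_frames, weights, means, variances)
print("GMM scores agree:", math.isclose(score_forward, score_reversed))
print("DTW(frames, frames):", dtw_distance(frames, frames))
print("DTW(frames, reversed) > 0:", dtw_distance(frames, reversed_frames) > 0.0)
```

Because the GMM score is a sum over frames, any permutation of the frames gives the same value up to floating-point rounding. DTW aligns frames in order, so reversing one sequence produces a nonzero cost. This is the kind of ordering information the abstract refers to as TSI.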
Related resources
Evaluation of a small-footprint text and language independent speaker recognition system on forensic data
In this paper we evaluate on a forensic task our text and language independent speaker recognition system, characterized by modest memory requirements and robustness to environment noise. Noise robustness is achieved by employing a Kalman filter-based sequential interacting multiple models (SIMM) algorithm. The evaluation data was provided by the Netherlands Forensic Institute (NFI) and consist...
Text-independent speaker recognition by speaker-specific GMM and speaker adapted syllable-based HMM
We present a new text-independent speaker recognition method that combines a speaker-specific Gaussian Mixture Model (GMM) with a syllable-based HMM adapted by MLLR or MAP. The robustness of this method to changes in speaking style was evaluated. The speaker identification experiment using NTT database which consists of sentences data uttered at three speed modes (normal, fast and ...
Phonetic, idiolectal and acoustic speaker recognition
This paper describes a text-independent speaker recognition system that achieves an equal error rate of less than 1% by combining phonetic, idiolect, and acoustic features. The phonetic system is a novel language-independent speaker-recognition system based on differences among speakers in dynamic realization of phonetic features (i.e., pronunciation), rather than spectral differences in voice q...
Text-constrained speaker recognition on a text-independent task
We present an approach to speaker recognition in the text-independent domain of conversational telephone speech using a text-constrained system designed to employ select high-frequency keywords in the speech stream. The system uses speaker word models generated via Hidden Markov Models (HMMs), a departure from the traditional Gaussian Mixture Model (GMM) approach dominant in text-independent wor...
Text-Independent Speaker Verification via State Alignment
To model the speech utterance at a finer granularity, this paper presents a novel state-alignment-based supervector modeling method for text-independent speaker verification, which takes advantage of the state-alignment method used in hidden Markov model (HMM) based acoustic modeling in speech recognition. In this way, the proposed modeling method can convert a text-independent speaker verification...
Journal title:
- EURASIP J. Adv. Sig. Proc.
Volume: 2005
Publication date: 2005